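Deep learning models have proven highly effective at recognizing findings in medical images. However, they cannot handle the ever-changing clinical environment, which keeps bringing newly annotated medical data from different sources. To exploit these incoming data streams, such models would benefit greatly from learning sequentially from new samples without forgetting previously acquired knowledge. In this paper, we introduce a benchmark for continual disease classification on the MedMNIST collection by applying existing state-of-the-art continual learning methods. In particular, we consider three continual learning scenarios: task-incremental learning, class-incremental learning, and the newly defined cross-domain incremental learning. Task- and class-incremental learning of diseases address the problem of classifying new samples without retraining the model from scratch, while cross-domain incremental learning addresses the problem of handling datasets that originate from different institutions while retaining previously acquired knowledge. We perform a thorough analysis of the performance and examine how well-known challenges of continual learning, such as catastrophic forgetting, manifest in this setting. The encouraging results show that continual learning has major potential to advance disease classification and to produce more robust and efficient learning frameworks for clinical settings. The code repository, data partitions, and baseline results for the complete benchmark will be made publicly available.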
translated by Google Translate
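Kernel continual learning \cite{derakhshani2021kernel} has recently emerged as a strong continual learner thanks to its non-parametric ability to tackle task interference and catastrophic forgetting. Unfortunately, its success comes at the expense of an explicit memory that stores samples from past tasks, which hampers scalability to continual learning settings with a large number of tasks. In this paper, we introduce generative kernel continual learning, which explores the synergy between generative models and kernels for continual learning. The generative model is able to produce representative samples for kernel learning, which removes the dependence on memory in kernel continual learning. Moreover, because we replay only on the generative model, we avoid task interference while being computationally more efficient than previous methods that need replay on the entire model. We further introduce a supervised contrastive regularization, which enables our model to generate more discriminative samples for better kernel-based classification performance. We conduct extensive experiments on three widely used continual learning benchmarks that demonstrate the abilities and benefits of our contributions. Most notably, on the challenging SplitCIFAR100 benchmark, with just a simple linear kernel we obtain the same accuracy as kernel continual learning using one tenth of the memory, or a 10.1% accuracy gain for the same memory budget.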
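As a rough illustration of classifying with a kernel over a small set of representative samples, the sketch below fits a kernel ridge regression classifier with a linear kernel. It is a minimal sketch, not the authors' implementation: the choice of kernel ridge regression, the ridge value, and all function names are assumptions, and the hard-coded `samples` list merely stands in for what a generative model would produce in place of a stored memory.

```python
def linear_kernel(a, b):
    """Linear kernel: the dot product of two feature vectors."""
    return sum(x * y for x, y in zip(a, b))

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]  # work on a copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_kernel_ridge(samples, labels, num_classes, ridge=0.1):
    """Return one dual coefficient vector per class: (K + ridge * I)^-1 y_c."""
    n = len(samples)
    gram = [
        [linear_kernel(samples[i], samples[j]) + (ridge if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return [
        solve(gram, [1.0 if y == c else 0.0 for y in labels])
        for c in range(num_classes)
    ]

def predict(x, samples, alphas):
    """Score each class as alpha_c . k(x, samples) and return the best class index."""
    k = [linear_kernel(x, s) for s in samples]
    scores = [sum(a * kv for a, kv in zip(alpha, k)) for alpha in alphas]
    return scores.index(max(scores))

# Hypothetical stand-ins for generated samples of two classes (assumption).
samples = [[1.0, 0.1], [0.9, -0.1], [0.1, 1.0], [-0.1, 0.9]]
labels = [0, 0, 1, 1]
alphas = fit_kernel_ridge(samples, labels, num_classes=2)
pred_a = predict([0.8, 0.2], samples, alphas)
pred_b = predict([0.2, 0.8], samples, alphas)
```

Because the classifier depends only on the sample set, swapping stored samples for generated ones changes where `samples` comes from but not the fitting step.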
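Neural memory enables fast adaptation to new tasks with just a few training samples. Existing memory models store features only from the single last layer, which does not generalize well when there is a domain shift between the training and test distributions. Rather than relying on a flat memory, we propose a hierarchical alternative that stores features at different semantic levels. We introduce a hierarchical prototype model, in which each level of the prototype fetches the corresponding information from the hierarchical memory. The model can flexibly rely on features at different semantic levels when the domain shift demands it. We learn the model with a newly derived hierarchical variational inference framework, in which the hierarchical memory and the prototypes are jointly optimized. To explore and exploit the importance of the different semantic levels, we further propose to learn the weights associated with the prototype at each level in a data-driven way, which enables the model to adaptively choose the most generalizable features. We conduct thorough ablation studies to demonstrate the effectiveness of each component of our model. New state-of-the-art performance on cross-domain few-shot classification and competitive performance on traditional few-shot classification further substantiate the benefit of hierarchical variational memory.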
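The idea of weighting prototypes from several semantic levels can be sketched as follows. This is only an illustration under stated assumptions: the squared-distance scoring, the softmax over level weights, and every name here are hypothetical, and the variational inference and memory components described in the abstract are left out entirely.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(query_levels, prototypes, level_logits):
    """Combine per-level prototype distances into class probabilities.

    query_levels[l] is the query feature at level l;
    prototypes[l][c] is the prototype of class c at level l;
    level_logits are (hypothetical) learned importances, one per level.
    """
    level_weights = softmax(level_logits)
    num_classes = len(prototypes[0])
    class_logits = [
        -sum(
            w * squared_distance(q, protos[c])
            for w, q, protos in zip(level_weights, query_levels, prototypes)
        )
        for c in range(num_classes)
    ]
    return softmax(class_logits)

# Two levels, two classes; the two levels disagree about the query's class.
prototypes = [
    [[0.0, 0.0], [1.0, 1.0]],  # level 0 prototypes for class 0 and class 1
    [[0.0, 0.0], [1.0, 1.0]],  # level 1 prototypes for class 0 and class 1
]
query_levels = [[0.9, 0.9], [0.1, 0.1]]

probs_favor_level0 = classify(query_levels, prototypes, [2.0, 0.0])
probs_favor_level1 = classify(query_levels, prototypes, [0.0, 4.0])
```

In this toy setup, shifting the level weights changes which class wins, which is the behavior one would want when only some levels transfer across domains.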
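Neural processes have recently emerged as a class of powerful neural latent variable models that combine the strengths of neural networks and stochastic processes. Because they can encode contextual data in the network's function space, they offer a new way to model task relatedness in multi-task learning. To study this potential, we develop multi-task neural processes, a new variant of neural processes for multi-task learning. In particular, we propose to explore transferable knowledge from related tasks in the function space in order to provide an inductive bias that improves each individual task. To this end, we derive function priors in a hierarchical Bayesian inference framework, which enables each task to incorporate the shared knowledge provided by related tasks into the context of its prediction function. Our multi-task neural processes expand the scope of vanilla neural processes and provide a new way of exploring task relatedness in function space for multi-task learning. The proposed multi-task neural processes are able to learn multiple tasks with limited labeled data and in the presence of domain shift. We perform extensive experimental evaluations on several benchmarks for multi-task regression and classification. The results demonstrate the effectiveness of multi-task neural processes in transferring useful knowledge among tasks, as well as their superior performance in multi-task classification and brain image segmentation.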
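Multi-task learning aims to explore task relatedness in order to improve individual tasks, which is particularly important in the challenging scenario where only limited data is available for each task. To tackle this challenge, we propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks. We cast multi-task learning as a variational Bayesian inference problem, in which task relatedness is explored in a unified manner by specifying priors. To incorporate shared knowledge into each task, we design the prior of a task to be a learnable mixture of the variational posteriors of other related tasks, with the mixture learned through the Gumbel-Softmax technique. In contrast to previous methods, VMTL can exploit task relatedness for both representations and classifiers in a principled way by jointly inferring their posteriors. This enables individual tasks to fully leverage the inductive biases provided by related tasks, thereby improving the overall performance of all tasks. Experimental results show that the proposed VMTL effectively handles a variety of challenging multi-task learning settings with limited training data, for both classification and regression. Our method consistently surpasses previous methods, including strong Bayesian approaches, and achieves state-of-the-art performance on five benchmark datasets.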
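To make the mixture-prior idea concrete, the sketch below draws relaxed mixing weights with the Gumbel-Softmax trick and uses them to combine the means of other tasks' posteriors. It is a minimal sketch under assumptions: the function names, the temperature, the example logits, and the restriction to posterior means are all illustrative and are not taken from the VMTL implementation.

```python
import math
import random

def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw a relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise via inverse transform; clamp u strictly inside (0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_prior_mean(posterior_means, weights):
    """Mean of a mixture distribution: the weighted average of the component means."""
    dim = len(posterior_means[0])
    return [
        sum(w * mean[d] for w, mean in zip(weights, posterior_means))
        for d in range(dim)
    ]

rng = random.Random(0)  # seeded so the sketch is reproducible
mixing_logits = [0.5, 1.5, -1.0]  # hypothetical learnable parameters, one per related task
weights = gumbel_softmax(mixing_logits, temperature=0.5, rng=rng)

# Hypothetical posterior means of three related tasks.
related_posterior_means = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
prior_mean = mixture_prior_mean(related_posterior_means, weights)
```

Lower temperatures push the sampled weights toward a one-hot choice of a single related task, while higher temperatures spread the weight more evenly.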
Knowledge graph embedding (KGE), which maps entities and relations in a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult for KGE models to infer inductively. To address this challenge, we resort to analogical inference and propose AnKGE, a novel and general self-supervised framework that enhances KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects at the entity level, relation level, and triple level. In AnKGE, we train an analogy function for each level of analogical inference; it takes the original element embedding from a well-trained KGE model as input and outputs the analogical object embedding. To combine the inductive inference capability of the original KGE model with the analogical inference capability added by AnKGE, we interpolate the analogy score with the base model score and introduce adaptive weights in the score function for prediction. Through extensive experiments on the FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on the link prediction task and performs analogical inference well.
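The score interpolation can be pictured with a small sketch. Everything concrete here is an assumption made for illustration: the TransE-style base score, the additive form of the interpolation, and the example scores and weights are hypothetical and are not taken from the AnKGE paper.

```python
import math

def transe_score(head, relation, tail):
    """Illustrative base KGE score: negative L2 distance ||h + r - t|| (TransE-style)."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def interpolated_score(base_score, analogy_scores, weights):
    """Add each level's analogy score to the base score, scaled by its weight."""
    return base_score + sum(w * s for w, s in zip(weights, analogy_scores))

head, relation, tail = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
base = transe_score(head, relation, tail)  # h + r equals t here, so the distance is 0

analogy_scores = [-0.2, -0.5, -0.1]  # hypothetical entity-, relation-, triple-level scores
adaptive_weights = [0.5, 0.3, 0.2]   # hypothetical adaptive weights, one per level
final = interpolated_score(base, analogy_scores, adaptive_weights)
```

Setting all weights to zero recovers the base model score, so the base KGE model remains the fallback when no level offers a useful analogy.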
Face anti-spoofing (FAS) is essential for securing face recognition systems against various physical attacks. However, recent research generally focuses on short-distance applications (e.g., phone unlocking) and gives little consideration to long-distance scenes (e.g., surveillance security checks). To promote relevant research and fill this gap in the community, we collect a large-scale Surveillance High-Fidelity Mask (SuHiFiMask) dataset captured under 40 surveillance scenes, which has 101 subjects from different age groups with 232 3D attacks (high-fidelity masks), 200 2D attacks (posters, portraits, and screens), and 2 adversarial attacks. In these scenes, low image resolution and noise interference are new challenges for surveillance FAS. Together with the SuHiFiMask dataset, we propose a Contrastive Quality-Invariance Learning (CQIL) network to alleviate the performance degradation caused by image quality from three aspects: (1) An Image Quality Variable module (IQV) is introduced to recover image information associated with discrimination by incorporating a super-resolution network. (2) Generated sample pairs are used to simulate quality variance distributions, helping contrastive learning strategies obtain robust feature representations under quality variation. (3) A Separate Quality Network (SQN) is designed to learn discriminative features independent of image quality. Finally, extensive experiments verify the quality of the SuHiFiMask dataset and the superiority of the proposed CQIL.
When using LiDAR semantic segmentation models for safety-critical applications such as autonomous driving, it is essential to understand and improve their robustness to a wide range of LiDAR corruptions. In this paper, we aim to comprehensively analyze the robustness of LiDAR semantic segmentation models under various corruptions. To rigorously evaluate the robustness and generalizability of current approaches, we propose a new benchmark called SemanticKITTI-C, which features 16 out-of-domain LiDAR corruptions in three groups, namely adverse weather, measurement noise, and cross-device discrepancy. We then systematically investigate 11 LiDAR semantic segmentation models spanning different input representations (e.g., point clouds, voxels, and projected images), network architectures, and training schemes. Through this study, we obtain two insights: 1) The input representation plays a crucial role in robustness; specifically, under a given corruption, different representations perform differently. 2) Although state-of-the-art LiDAR semantic segmentation methods achieve promising results on clean data, they are less robust when dealing with noisy data. Finally, based on these observations, we design a robust LiDAR segmentation model (RLSeg) that greatly boosts robustness with simple but effective modifications. We hope that our benchmark, comprehensive analysis, and observations can boost future research in robust LiDAR semantic segmentation for safety-critical applications.
Designing better deep networks and better reinforcement learning (RL) algorithms are both important for deep RL. This work focuses on the former. Previous methods build the network from several modules such as CNN, LSTM, and Attention. Recent methods combine the Transformer with these modules for better performance. However, training a network composed of mixed modules requires tedious optimization skills, making these methods inconvenient to use in practice. In this paper, we propose to design \emph{pure Transformer-based networks} for deep RL, aiming to provide off-the-shelf backbones for both the online and offline settings. Specifically, we propose the Transformer in Transformer (TIT) backbone, which cascades two Transformers in a natural way: the inner one processes a single observation, while the outer one processes the observation history; combining both is expected to extract spatial-temporal representations for good decision-making. Experiments show that TIT consistently achieves satisfactory performance across different settings.
Unbiased learning to rank (ULTR) studies the problem of mitigating various biases in implicit user feedback data such as clicks, and has received considerable attention recently. A popular ULTR approach for real-world applications uses a two-tower architecture, where click modeling is factorized into a relevance tower with regular input features and a bias tower with bias-relevant inputs such as the position of a document. A successful factorization allows the relevance tower to be free of biases. In this work, we identify a critical issue that existing ULTR methods have ignored: the bias tower can be confounded with the relevance tower via the underlying true relevance. In particular, the positions were determined by the logging policy, i.e., the previous production model, which would itself possess relevance information. We give both theoretical analysis and empirical results to show the negative effects of such a correlation on the relevance tower. We then propose three methods to mitigate the negative confounding effects by better disentangling relevance and bias. Empirical results on both controlled public datasets and a large-scale industry dataset show the effectiveness of the proposed approaches.
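A two-tower click model of the kind described above can be sketched in a few lines. This is a minimal sketch under assumptions: the additive combination of the two logits, the linear relevance tower, the per-position bias table, and all parameter values are illustrative choices, not details taken from the paper.

```python
import math

def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def relevance_tower(features, weights):
    """Hypothetical linear relevance tower over regular document features."""
    return sum(w * f for w, f in zip(weights, features))

def bias_tower(position, position_logits):
    """Hypothetical bias tower: one logit per display position."""
    return position_logits[position]

def click_probability(features, position, weights, position_logits):
    """Assumed additive factorization: P(click) = sigmoid(relevance + bias)."""
    logit = relevance_tower(features, weights) + bias_tower(position, position_logits)
    return sigmoid(logit)

weights = [1.0, -0.5]               # illustrative relevance parameters
position_logits = [0.0, -1.0, -2.0]  # illustrative bias: lower positions get fewer clicks
doc = [2.0, 2.0]

p_top = click_probability(doc, 0, weights, position_logits)
p_low = click_probability(doc, 2, weights, position_logits)
serving_score = relevance_tower(doc, weights)  # rank by relevance only at serving time
```

The same document receives a lower click probability at a lower position, while its serving score is unchanged. The confounding issue raised in the abstract arises because, in logged data, the position fed to the bias tower was itself chosen by a model that knew something about relevance.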